Round 1: Screening Test (1 Hour – PySpark Coding)
Environment: Coding done in a virtual lab provided by the company.
Task: Complex data transformation using PySpark.
Difficulty: High; the problem required chaining multiple transformations to reach the expected output.
Skills Tested:
- DataFrame operations
- Joins, window functions
- Handling nested structures, nulls, and schema enforcement (see the sketch after this list)
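To give a flavor of what such a task can look like, here is a minimal, self-contained sketch that chains these operations over an invented orders dataset with a nested customer struct. Every name and value is hypothetical, not the actual interview problem:

```python
from pyspark.sql import SparkSession, Window, functions as F
from pyspark.sql.types import (StructType, StructField, StringType,
                               IntegerType, DoubleType)

spark = SparkSession.builder.appName("screening-sketch").getOrCreate()

# Explicit schema enforcement for a fictional orders dataset with a
# nested customer struct; names and values are invented for illustration.
schema = StructType([
    StructField("order_id", IntegerType(), False),
    StructField("customer", StructType([
        StructField("id", StringType(), True),
        StructField("region", StringType(), True),
    ]), True),
    StructField("amount", DoubleType(), True),
])
orders = spark.createDataFrame(
    [(1, ("c1", "NA"), 120.0), (2, ("c2", "EU"), None), (3, ("c3", "NA"), 90.0)],
    schema,
)
regions = spark.createDataFrame(
    [("NA", "North America"), ("EU", "Europe")], ["region_code", "region_name"]
)

w = Window.partitionBy("region_code").orderBy(F.desc("amount"))

result = (
    orders
    .select(                                             # flatten the nested struct
        "order_id",
        F.col("customer.id").alias("customer_id"),
        F.col("customer.region").alias("region_code"),
        F.coalesce(F.col("amount"), F.lit(0.0)).alias("amount"),  # null handling
    )
    .join(regions, "region_code", "left")                # join with a lookup table
    .withColumn("rank_in_region", F.rank().over(w))      # window function
    .filter(F.col("rank_in_region") <= 2)                # top orders per region
)
result.show()
```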
Round 2: Technical + Project Discussion (Face-to-Face)
✅ SQL (5 Questions – Hard Level)
Advanced SQL involving:
- Multiple joins
- Window functions (LAG, LEAD, NTILE)
- CTEs and nested queries
- Aggregations with filtering (an example follows this list)
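A self-contained example in the spirit of those questions, run through spark.sql against a fictional sales table registered as a temp view; the table, columns, and data are assumptions, not the actual questions:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-sketch").getOrCreate()

# Fictional sales data registered as a temp view so the query runs end to end.
spark.createDataFrame(
    [("EU", "2024-01-15", 120.0, "completed"),
     ("EU", "2024-02-10", 80.0, "completed"),
     ("NA", "2024-01-20", 200.0, "cancelled")],
    ["region", "sale_date", "amount", "status"],
).createOrReplaceTempView("sales")

query = """
WITH monthly AS (                           -- CTE
    SELECT region,
           date_trunc('month', CAST(sale_date AS DATE)) AS month,
           SUM(amount) AS revenue
    FROM sales
    WHERE status = 'completed'              -- aggregation with filtering
    GROUP BY region, date_trunc('month', CAST(sale_date AS DATE))
)
SELECT region, month, revenue,
       LAG(revenue)  OVER (PARTITION BY region ORDER BY month)   AS prev_revenue,
       NTILE(4)      OVER (PARTITION BY region ORDER BY revenue) AS revenue_quartile
FROM monthly
"""
spark.sql(query).show()
```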
✅ Project Discussion
Deep dive into past projects:
- Architecture
- Tooling (e.g., Spark, Delta Lake, Azure/AWS)
- Your role in data ingestion, transformation, and performance tuning
✅ PySpark (4 Coding Questions)
Real-world data manipulation using:
- groupBy, agg, window
- Conditional logic with when, otherwise
- Handling nulls and schema mismatches (sketched below)
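A short sketch combining these building blocks on a made-up events DataFrame; the column names, data, and the high/low engagement rule are all invented for illustration:

```python
from pyspark.sql import SparkSession, Window, functions as F

spark = SparkSession.builder.appName("pyspark-sketch").getOrCreate()

# Made-up events data; None in `duration` stands in for a real-world null.
events = spark.createDataFrame(
    [("u1", "click", 3), ("u1", "view", None), ("u2", "click", 7)],
    ["user_id", "event_type", "duration"],
)

cleaned = events.fillna({"duration": 0})             # null handling

labeled = cleaned.withColumn(                        # conditional logic
    "engagement",
    F.when(F.col("duration") >= 5, "high").otherwise("low"),
)

totals = labeled.groupBy("user_id").agg(             # groupBy + agg
    F.sum("duration").alias("total_duration")
)

w = Window.orderBy(F.desc("total_duration"))         # window function
totals.withColumn("rank", F.dense_rank().over(w)).show()
```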
✅ Spark Optimization Techniques
- Tuning Spark configurations for performance (see the sketch below)
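For context, a sketch of the kind of tuning that typically comes up in this sort of discussion: shuffle partitioning, adaptive query execution, broadcast joins, and caching. The specific values and tables here are illustrative, not recommendations:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("tuning-sketch").getOrCreate()

# Commonly tuned settings; the values below are illustrative.
spark.conf.set("spark.sql.shuffle.partitions", "200")          # shuffle parallelism
spark.conf.set("spark.sql.adaptive.enabled", "true")           # adaptive query execution
spark.conf.set("spark.sql.autoBroadcastJoinThreshold",
               str(10 * 1024 * 1024))                          # broadcast tables under 10 MB

# Hypothetical fact/dimension tables to demonstrate a broadcast join hint.
facts = spark.range(1_000_000).withColumnRenamed("id", "key")
dims = spark.createDataFrame([(0, "a"), (1, "b")], ["key", "label"])

joined = facts.join(broadcast(dims), "key")  # avoid shuffling the small side
joined.cache()                               # reuse without recomputation
joined.count()
```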
Round 3: HR
- Salary discussion, preferred location, and other logistics.